Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables
Clustering analysis is one of the most widely used statistical tools in many
emerging areas such as microarray data analysis. For microarray and other
high-dimensional data, the presence of many noise variables may mask underlying
clustering structures. Hence removing noise variables via variable selection is
necessary. For simultaneous variable selection and parameter estimation,
existing penalized likelihood approaches in model-based clustering analysis all
assume a common diagonal covariance matrix across clusters, which however may
not hold in practice. To analyze high-dimensional data, particularly those with
relatively low sample sizes, this article introduces a novel approach that
shrinks the variances together with means, in a more general situation with
cluster-specific (diagonal) covariance matrices. Furthermore, selection of
grouped variables via inclusion or exclusion of a group of variables altogether
is permitted by a specific form of penalty, which facilitates incorporating
subject-matter knowledge, such as gene functions in clustering microarray
samples for disease subtype discovery. For implementation, EM algorithms are
derived for parameter estimation, in which the M-steps clearly demonstrate the
effects of shrinkage and thresholding. Numerical examples, including an
application to acute leukemia subtype discovery with microarray gene expression
data, are provided to demonstrate the utility and advantage of the proposed
method. Comment: Published at http://dx.doi.org/10.1214/08-EJS194 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
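The abstract above notes that the M-steps exhibit shrinkage and thresholding. As a rough illustration only, the sketch below shows the soft-thresholding update that arises when an L1 penalty is placed on a single cluster mean under a Gaussian model with a known variance. The function names, the parameter name lam, and the restriction to the mean update are assumptions made for this example; the article itself also shrinks the cluster-specific variances, which is not reproduced here.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def penalized_mean_update(x, tau, sigma2, lam):
    """L1-penalized M-step update for one cluster mean in one variable.

    x      : observed values of the variable, one per sample
    tau    : posterior probabilities that each sample belongs to the cluster
    sigma2 : current variance of the variable in this cluster (assumed known)
    lam    : penalty parameter (hypothetical name)
    """
    weight = sum(tau)
    # Unpenalized update: the posterior-weighted sample mean.
    weighted_mean = sum(t * xi for t, xi in zip(tau, x)) / weight
    # The L1 penalty shifts the weighted mean toward zero and sets it
    # exactly to zero when it is small relative to the threshold.
    return soft_threshold(weighted_mean, lam * sigma2 / weight)


# A clearly non-zero mean is shrunk; a near-zero mean is thresholded away.
informative = penalized_mean_update([1.0, 3.0], [1.0, 1.0], 1.0, 1.0)
noise = penalized_mean_update([0.2, -0.1], [1.0, 1.0], 1.0, 1.0)
```

With standardized data, a variable whose means are thresholded to zero in every cluster no longer separates the clusters, which is how such a penalty performs variable selection.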
Variable selection in penalized model-based clustering via regularization on grouped parameters
Summary: Penalized model-based clustering has been proposed for high-dimensional data with small sample sizes, such as those arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with the grouping of structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
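To make the contrast between an elementwise L1 penalty and a penalty on grouped parameters concrete, the sketch below compares the two generic thresholding operators applied to the means of one variable across clusters. This is the standard group soft-thresholding operator, used here as an assumption-laden illustration and not as the exact M-step of the paper summarized above; all function and variable names are hypothetical.

```python
import math


def elementwise_soft_threshold(values, t):
    """Apply L1 soft-thresholding to each value separately."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]


def group_soft_threshold(values, t):
    """Shrink the whole vector by its Euclidean norm.

    Either every entry is set to zero (norm <= t) or every entry is
    scaled by the same factor, so the group is kept or dropped as a unit.
    """
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= t:
        return [0.0 for _ in values]
    scale = 1.0 - t / norm
    return [scale * v for v in values]


# Means of one variable in two clusters (illustrative numbers only).
cluster_means = [3.0, 4.0]
kept = group_soft_threshold(cluster_means, 2.5)     # norm 5 > 2.5: scaled by 0.5
dropped = group_soft_threshold(cluster_means, 5.0)  # norm 5 <= 5: all zero
mixed = elementwise_soft_threshold([3.0, 0.4], 1.0) # only the small entry is zeroed
```

Under the grouped operator a variable is removed only when all of its cluster means vanish together, whereas the elementwise operator can zero some cluster means and leave others, which does not by itself exclude the variable.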
Functional group-based linkage analysis of gene expression trait loci-3
Phosphoinositide-mediated signaling) and 5 (regulation of cyclin dependent protein kinase activity). Copyright information: Taken from "Functional group-based linkage analysis of gene expression trait loci", http://www.biomedcentral.com/1753-6561/1/S1/S117. BMC Proceedings 2007;1(Suppl 1):S117-S117. Published online 18 Dec 2007. PMCID: PMC2367612.
Functional group-based linkage analysis of gene expression trait loci-0
Ed signaling; 3) GTP biosynthesis; 4) purine nucleotide biosynthesis; 5) regulation of cyclin dependent protein kinase activity; 6) meiosis; 7) mRNA-nucleus export; 8) cholesterol metabolism; 9) biosynthesis; and 10) epidermis development.
Functional group-based linkage analysis of gene expression trait loci-4
Ed signaling; 3) GTP biosynthesis; 4) purine nucleotide biosynthesis; 5) regulation of cyclin dependent protein kinase activity; 6) meiosis; 7) mRNA-nucleus export; 8) cholesterol metabolism; 9) biosynthesis; and 10) epidermis development.
Pairwise correlations of the ten functional groups with highest mean heritability
Taken from "Functional group-based linkage analysis of gene expression trait loci", BMC Proceedings 2007;1(Suppl 1):S117-S117.
Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data
Motivation: Model-based clustering has been widely used, e.g. in microarray data analysis. Since variable selection is necessary for high-dimensional data, several penalized model-based clustering methods have been proposed to realize simultaneous variable selection and clustering. However, the existing methods all assume that the variables are independent with the use of diagonal covariance matrices.